@saethlin
Member

@saethlin saethlin commented Oct 23, 2025

The objective of this PR is to improve compilation performance for crates that define a lot of trivial consts. This is a flamegraph of a build of a library crate that is just 100,000 trivial consts, taken from a nightly compiler:
[Image: flamegraph of the build (2025-10-25-164005_842x280_scrot)]
My objective is to target all of the cycles in eval_to_const_value_raw that are not part of mir_built, because if you look at the mir_built for a trivial const, we already have the value available.
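A benchmark crate like the one described above could be produced with a small generator. This is only a sketch: the `generate_trivial_consts` function and the `CONST_{i}` naming scheme are assumptions made for illustration, not the script used for the flamegraph.

```rust
/// Builds the source text of a library crate that consists only of `n`
/// trivial consts. The `CONST_{i}` names are hypothetical.
pub fn generate_trivial_consts(n: usize) -> String {
    let mut src = String::new();
    for i in 0..n {
        // Each line has the shape `const NAME: usize = <literal>;`.
        src.push_str(&format!("pub const CONST_{i}: usize = {i};\n"));
    }
    src
}
```

Writing the result of `generate_trivial_consts(100_000)` to a `lib.rs` would give a crate of the size mentioned above.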

In this PR, the definition of a trivial const is this:

const A: usize = 0;

Specifically, we check whether the mir_built body is a single basic block containing one assign statement and a return terminator, where the assign statement assigns an Operand::Constant(Const::Val). The MIR dumps for these look like:

const A: usize = {
    let mut _0: usize;

    bb0: {
        _0 = const 0_usize;
        return;
    }
}

The implementation is built around a new query, trivial_const(LocalDefId) -> Option<(ConstValue, Ty)> which returns the contents of the Const::Val in the mir_built if the LocalDefId is a trivial const.
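The shape check described above can be sketched on a toy model of MIR. Everything here is a simplified stand-in: the `Body`, `BasicBlock`, `Statement`, `Operand`, and `Const` types are hypothetical and much smaller than the real rustc types (for example, there is no `Rvalue` layer and the value is a plain integer plus a type name), so this shows the idea rather than the PR's code.

```rust
// Toy stand-ins for MIR types; these are NOT the rustc definitions.
#[derive(Clone, Debug, PartialEq)]
pub enum Const {
    Val(u128, &'static str),   // (value bits, type name)
    Unevaluated(&'static str), // path of another const, e.g. `B`
}

#[derive(Clone, Debug, PartialEq)]
pub enum Operand {
    Constant(Const),
    Copy(usize), // read of local `_n`
}

#[derive(Clone, Debug, PartialEq)]
pub enum Statement {
    Assign { place: usize, value: Operand },
    StorageLive(usize),
}

#[derive(Clone, Debug, PartialEq)]
pub enum Terminator {
    Return,
    Goto(usize),
}

pub struct BasicBlock {
    pub statements: Vec<Statement>,
    pub terminator: Terminator,
}

pub struct Body {
    pub basic_blocks: Vec<BasicBlock>,
}

/// Returns the value and type if the body is a single block holding one
/// assignment of an already-evaluated constant followed by `return`.
pub fn trivial_const(body: &Body) -> Option<(u128, &'static str)> {
    // Exactly one basic block.
    let [block] = body.basic_blocks.as_slice() else {
        return None;
    };
    // The block must end in a return terminator.
    if block.terminator != Terminator::Return {
        return None;
    }
    // Exactly one statement, and it assigns a `Const::Val` operand.
    let [Statement::Assign { value: Operand::Constant(Const::Val(bits, ty)), .. }] =
        block.statements.as_slice()
    else {
        return None;
    };
    Some((*bits, *ty))
}
```

In this model, a body that assigns `Const::Unevaluated` or that has more than one statement falls through to `None`, which corresponds to taking the normal, slower path.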

Then I added debug assertions to the beginning of mir_for_ctfe and mir_promoted to prevent trying to get the body of a trivial const, because that would defeat the optimization here. But these are deliberately debug assertions because the consequence of failing the assertion is that compilation is slow, not corrupt. If we made these hard assertions, I'm sure there are obscure scenarios people will run into where the compiler would ICE instead of continuing on compilation, just a bit slower. I'd like to know about those, but I do not think serving up an ICE is worth it.

With the assertions in place, I just added logic around all the places they were hit, to skip over trying to analyze the bodies of trivial consts.
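The pattern of "debug assertion in the body query, fast path in every caller" might look roughly like the following. The `Compiler` struct, its `u32` ids, and the "body is a list of numbers to sum" model are all invented for this sketch; only the names `trivial_const` and `mir_for_ctfe` come from the description above.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical miniature compiler: each const's "body" is a list of
/// numbers whose sum is the const's value.
pub struct Compiler {
    bodies: HashMap<u32, Vec<u64>>,
    /// Counts how often a full body was requested (the slow path).
    pub bodies_requested: Cell<usize>,
}

impl Compiler {
    pub fn new(bodies: Vec<(u32, Vec<u64>)>) -> Self {
        Compiler {
            bodies: bodies.into_iter().collect(),
            bodies_requested: Cell::new(0),
        }
    }

    /// In this toy model a const is trivial when its body is one value.
    pub fn trivial_const(&self, def: u32) -> Option<u64> {
        match self.bodies.get(&def)?.as_slice() {
            [value] => Some(*value),
            _ => None,
        }
    }

    /// Slow path. Asking for the body of a trivial const is still correct,
    /// just wasteful, so this is a debug assertion and not a hard one.
    pub fn mir_for_ctfe(&self, def: u32) -> &[u64] {
        debug_assert!(
            self.trivial_const(def).is_none(),
            "requested the body of a trivial const"
        );
        self.bodies_requested.set(self.bodies_requested.get() + 1);
        self.bodies.get(&def).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Caller that skips the body entirely for trivial consts.
    pub fn eval_const(&self, def: u32) -> u64 {
        if let Some(value) = self.trivial_const(def) {
            return value;
        }
        self.mir_for_ctfe(def).iter().sum()
    }
}
```

Because `mir_for_ctfe` would return the same answer for a trivial const, a caller that misses the fast path in a release build only loses time, which matches the reasoning for not making the assertion a hard one.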

In the future, I'd like to see this work extended by:

  • Pushing detection of trivial consts before MIR building
  • Including DefKind::Static and DefKind::InlineConst
  • Including consts like _1 = const 0_usize; _0 = &_1, which would make a lot of promoteds into trivial consts
  • Handling less-trivial consts like const A: usize = B, which have Operand::Constant(Const::Unevaluated)
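To make the boundary concrete, here are source-level items matching the cases listed above. The comments restate the PR description; the exact MIR each item produces is not verified here, so treat the classification as an illustration.

```rust
// Trivial under this PR's definition: the body is expected to be
// `_0 = const 0_usize; return;`.
pub const A: usize = 0;

// Not covered yet: the initializer names another const, which the
// description says shows up as `Operand::Constant(Const::Unevaluated)`.
pub const B: usize = A;

// Not covered yet: a reference to a constant, described above as
// `_1 = const 0_usize; _0 = &_1`.
pub const C: &usize = &0;

// Not covered yet: statics (`DefKind::Static`).
pub static D: usize = 0;
```

All four items evaluate to zero; only `A` would take the fast path as currently defined.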

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 23, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 23, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 23, 2025
@rust-bors

rust-bors bot commented Oct 23, 2025

☀️ Try build successful (CI)
Build commit: 4c15d20 (4c15d2003befc82fb2064960f5520c8643947469, parent: 6501e64fcb02d22b49d6e59d10a7692ec8095619)

@rust-timer
Collaborator

Finished benchmarking commit (4c15d20): comparison URL.

Overall result: ❌✅ regressions and improvements - BENCHMARK(S) FAILED

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

❗ ❗ ❗ ❗ ❗
Warning ⚠️: The following benchmark(s) failed to build:

  • serde-1.0.219-threads4

❗ ❗ ❗ ❗ ❗

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.9%] | 77 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 1.6%] | 29 |
| Improvements ✅ (primary) | -2.6% | [-5.6%, -1.8%] | 13 |
| Improvements ✅ (secondary) | -2.5% | [-2.8%, -2.3%] | 3 |
| All ❌✅ (primary) | -0.1% | [-5.6%, 0.9%] | 90 |

Max RSS (memory usage)

Results (primary -0.2%, secondary 2.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.4%, 1.3%] | 7 |
| Regressions ❌ (secondary) | 7.1% | [4.6%, 8.2%] | 4 |
| Improvements ✅ (primary) | -1.7% | [-2.0%, -1.5%] | 4 |
| Improvements ✅ (secondary) | -1.3% | [-2.2%, -0.8%] | 5 |
| All ❌✅ (primary) | -0.2% | [-2.0%, 1.3%] | 11 |

Cycles

Results (primary -1.5%, secondary -0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.5% | [3.5%, 3.5%] | 1 |
| Regressions ❌ (secondary) | 3.0% | [2.8%, 3.3%] | 3 |
| Improvements ✅ (primary) | -2.5% | [-3.2%, -2.0%] | 5 |
| Improvements ✅ (secondary) | -3.0% | [-6.0%, -1.7%] | 4 |
| All ❌✅ (primary) | -1.5% | [-3.2%, 3.5%] | 6 |

Binary size

Results (primary -0.6%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.8%] | 72 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.2%] | 14 |
| Improvements ✅ (primary) | -8.9% | [-8.9%, -8.8%] | 8 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 2 |
| All ❌✅ (primary) | -0.6% | [-8.9%, 0.8%] | 80 |

Bootstrap: 476.496s -> 475.064s (-0.30%)
Artifact size: 390.49 MiB -> 390.55 MiB (0.02%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 24, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 24, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 24, 2025
@rust-bors

rust-bors bot commented Oct 24, 2025

☀️ Try build successful (CI)
Build commit: 4931b5e (4931b5e2c7edb857fab6e19e39dd4e4e22a37a91, parent: ab925646fae038b02bd462cd328ae9eef1639236)

@rust-timer
Collaborator

Finished benchmarking commit (4931b5e): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.1%, 0.9%] | 67 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 1.6%] | 32 |
| Improvements ✅ (primary) | -9.7% | [-15.6%, -5.5%] | 13 |
| Improvements ✅ (secondary) | -2.5% | [-2.8%, -2.4%] | 3 |
| All ❌✅ (primary) | -1.3% | [-15.6%, 0.9%] | 80 |

Max RSS (memory usage)

Results (primary -0.8%, secondary -1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [0.5%, 3.5%] | 11 |
| Regressions ❌ (secondary) | 4.9% | [0.8%, 8.0%] | 6 |
| Improvements ✅ (primary) | -2.8% | [-4.0%, -1.3%] | 10 |
| Improvements ✅ (secondary) | -4.1% | [-5.4%, -2.4%] | 12 |
| All ❌✅ (primary) | -0.8% | [-4.0%, 3.5%] | 21 |

Cycles

Results (primary -9.2%, secondary -1.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.1% | [0.9%, 10.7%] | 6 |
| Improvements ✅ (primary) | -9.2% | [-14.1%, -4.2%] | 13 |
| Improvements ✅ (secondary) | -4.2% | [-8.6%, -2.1%] | 13 |
| All ❌✅ (primary) | -9.2% | [-14.1%, -4.2%] | 13 |

Binary size

Results (primary -0.7%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.8%] | 72 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.2%] | 22 |
| Improvements ✅ (primary) | -9.0% | [-9.1%, -9.0%] | 8 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 2 |
| All ❌✅ (primary) | -0.7% | [-9.1%, 0.8%] | 80 |

Bootstrap: 474.337s -> 474.199s (-0.03%)
Artifact size: 390.48 MiB -> 390.80 MiB (0.08%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 25, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 25, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 25, 2025
@rust-bors

rust-bors bot commented Oct 25, 2025

☀️ Try build successful (CI)
Build commit: 8907237 (8907237f16e88f2a9ee9a48ba052804abac7e239, parent: f435972085b697a1ece8ee6a1ac76efff8d1df7b)

@rust-timer
Collaborator

Finished benchmarking commit (8907237): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 16 |
| Regressions ❌ (secondary) | 0.6% | [0.0%, 1.6%] | 21 |
| Improvements ✅ (primary) | -3.7% | [-15.8%, -0.1%] | 71 |
| Improvements ✅ (secondary) | -4.3% | [-8.1%, -0.2%] | 20 |
| All ❌✅ (primary) | -3.0% | [-15.8%, 0.4%] | 87 |

Max RSS (memory usage)

Results (primary -3.7%, secondary 2.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.3% | [1.0%, 1.5%] | 3 |
| Regressions ❌ (secondary) | 5.4% | [1.6%, 8.2%] | 6 |
| Improvements ✅ (primary) | -4.5% | [-8.6%, -0.6%] | 20 |
| Improvements ✅ (secondary) | -1.5% | [-1.8%, -1.0%] | 6 |
| All ❌✅ (primary) | -3.7% | [-8.6%, 1.5%] | 23 |

Cycles

Results (primary -7.5%, secondary -3.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.5% | [1.5%, 1.5%] | 1 |
| Regressions ❌ (secondary) | 4.3% | [3.6%, 5.8%] | 4 |
| Improvements ✅ (primary) | -7.8% | [-15.2%, -2.5%] | 27 |
| Improvements ✅ (secondary) | -6.5% | [-8.7%, -5.1%] | 10 |
| All ❌✅ (primary) | -7.5% | [-15.2%, 1.5%] | 28 |

Binary size

Results (primary -1.7%, secondary -3.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.7%] | 64 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.3%] | 20 |
| Improvements ✅ (primary) | -4.9% | [-12.5%, -0.9%] | 40 |
| Improvements ✅ (secondary) | -9.5% | [-14.9%, -0.0%] | 12 |
| All ❌✅ (primary) | -1.7% | [-12.5%, 0.7%] | 104 |

Bootstrap: 474.979s -> 473.593s (-0.29%)
Artifact size: 390.46 MiB -> 390.54 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 25, 2025
| DefKind::AnonConst
) && trivial_const(&body).is_some()
{
return tcx.alloc_steal_mir(body);
Contributor

Why are we building the Mir for trivial consts?

Member Author

@saethlin saethlin Oct 25, 2025

My initial assessment indicated that most of the savings were from what we do after building MIR, and detecting trivial consts on THIR looked hard.

I think that pushing this detection earlier in compilation would make sense as a later extension.

Contributor

Absolutely. Pls drop that as a comment right here

@oli-obk
Contributor

oli-obk commented Oct 25, 2025

Even the benchmarks that regress in instruction count (non-incremental) seem to improve in measured time. I don't think this PR needs more performance tuning: it's somewhat expected that incremental builds may do more work now because of the extra query, but they are often still an improvement.

@saethlin
Member Author

saethlin commented Oct 25, 2025

I agree that the performance is pretty good. Also, all of my attempts at reducing the perf overhead have been completely ineffectual. The latest report looks better than the one before it because I let the optimization apply to more DefKinds.

I'm cleaning up the code so that it integrates a bit better in ctfe, which will probably make Ralf happier and also make this easier to extend.

@saethlin
Member Author

r? oli-obk

@rustbot
Collaborator

rustbot commented Oct 25, 2025

oli-obk is not on the review rotation at the moment.
They may take a while to respond.

@saethlin saethlin marked this pull request as ready for review October 25, 2025 21:02
@rustbot
Collaborator

rustbot commented Oct 25, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 25, 2025
matches!(tcx.def_kind(def), DefKind::AssocConst | DefKind::Const | DefKind::AnonConst)
}

fn trivial_const_provider<'tcx>(
Member

Seems worth moving this query impl into a separate file, to avoid growing this already-big file even bigger.

Also please add more comments, in particular explaining the overall contract this query must abide by.

tcx.ensure_done().coroutine_by_move_body_def_id(def);
}

// the `trivial_const` query uses mir_built, so make sure it is run.
Member

mir_built also uses trivial_const, so I am confused... this sounds cyclic?

// Trying to push this logic earlier in the compiler and never even produce the Body would
// probably improve compile time.
if def_kind_compatible_with_trivial_mir(tcx, def) && trivial_const(&body).is_some() {
let body = tcx.alloc_steal_mir(body);
Member

Suggested change
- let body = tcx.alloc_steal_mir(body);
+ // Skip all the passes below for trivial consts.
+ let body = tcx.alloc_steal_mir(body);

Labels

perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

6 participants